Tutorial 2: Phase analysis with tree search#
Dara is equipped with a parallelized tree search algorithm to identify possible phases present in a given XRD pattern.
In this tutorial, we will try to identify the phases in an experimental sample from a
solid-state reaction between GeO2 and ZnO.
You can download this tutorial project from here.
%pip install ipywidgets nbformat
from pathlib import Path
from dara import search_phases
pattern_path = "tutorial_data/GeO2-ZnO_700C_60min.xrdml"
# three elements are present in the sample
chemical_system = "Ge-O-Zn"
Step 1: Prepare reference phases#
Dara pre-builds an index of all the unique, low-energy phases in the ICSD and COD databases. It can also download CIF structures directly from the COD data server, so you do not need a local copy of the database.
Before every search, we will need to gather all the reference phases in the chemical
system for the search algorithm. Dara provides ICSDDatabase and CODDatabase to do
the filtering.
In this example, we will use CODDatabase to download all the phases in the chemical system of Ge-O-Zn.
from dara.structure_db import CODDatabase
# The COD database contains methods to filter phases in the chemical system
cod_database = CODDatabase()
# gather reference phases and save them to a directory called "cifs"
all_cod_ids = cod_database.get_cifs_by_chemsys(chemical_system, dest_dir="cifs")
2024-08-15 19:46:53,392 WARNING dara.structure_db Local copy of database not found. Attempting to download structures...
2024-08-15 19:46:55,322 INFO dara.structure_db Saving downloaded CIFs to dara_downloaded_cifs
2024-08-15 19:46:55,333 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,334 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,335 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,336 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,337 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,338 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,338 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,339 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,340 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,341 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,342 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,342 INFO dara.structure_db Skipping common gas: O2
2024-08-15 19:46:55,343 INFO dara.structure_db Skipping common gas: O2
Skipping high-energy phase: 1528389 (Ge, 96): e_hull = 0.1494
Skipping high-energy phase: 9013109 (Ge, 64): e_hull = 0.3137
Skipping high-energy phase: 1525835 (GeO2, 205): e_hull = 0.2246
Skipping high-energy phase: 1533322 (Ge7O23, 215): e_hull = 0.6571
Skipping high-energy phase: 1011223 (ZnO2, 19): e_hull = 0.1674
Skipping high-energy phase: 1529590 (ZnO2, 164): e_hull = 0.4588
Skipping high-energy phase: 1534836 (ZnO, 225): e_hull = 0.1473
Successfully copied 1538108.cif to O17.28_12_(cod_1538108)-None.cif in cifs
Successfully copied 9011050.cif to Ge_227_(cod_9011050)-0.cif in cifs
Successfully copied 7101738.cif to Ge_227_(cod_7101738)-0.cif in cifs
Successfully copied 9012435.cif to Zn_194_(cod_9012435)-0.cif in cifs
Successfully copied 4030923.cif to Zn_12_(cod_4030923)-None.cif in cifs
Successfully copied 9007435.cif to GeO2_136_(cod_9007435)-0.cif in cifs
Successfully copied 1525833.cif to GeO2_60_(cod_1525833)-36.cif in cifs
Successfully copied 2104024.cif to GeO2_60_(cod_2104024)-36.cif in cifs
Successfully copied 1526227.cif to GeO2_14_(cod_1526227)-None.cif in cifs
Successfully copied 2300365.cif to GeO2_152_(cod_2300365)-0.cif in cifs
Successfully copied 8000212.cif to Ge5O11_12_(cod_8000212)-None.cif in cifs
Successfully copied 9006858.cif to GeO2_58_(cod_9006858)-6.cif in cifs
Successfully copied 9007477.cif to GeO2_154_(cod_9007477)-0.cif in cifs
Successfully copied 9015579.cif to GeO2_92_(cod_9015579)-1.cif in cifs
Successfully copied 9004178.cif to ZnO_186_(cod_9004178)-0.cif in cifs
Successfully copied 1527883.cif to ZnO2_44_(cod_1527883)-None.cif in cifs
Successfully copied 1536063.cif to Zn10.26O48_160_(cod_1536063)-None.cif in cifs
Successfully copied 1537875.cif to ZnO_216_(cod_1537875)-7.cif in cifs
Successfully copied 4517837.cif to Zn5O12_15_(cod_4517837)-None.cif in cifs
Successfully copied 1007256.cif to Zn2Ge3O8_212_(cod_1007256)-2.cif in cifs
Successfully copied 1549040.cif to Zn2GeO4_227_(cod_1549040)-None.cif in cifs
Successfully copied 1549041.cif to Zn2GeO4_95_(cod_1549041)-None.cif in cifs
Successfully copied 9014631.cif to Zn2GeO4_148_(cod_9014631)-0.cif in cifs
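The log above shows that the download is not limited to ternary Zn-Ge-O compounds: elemental entries (Ge, Zn) and binary entries (GeO2, ZnO, ...) are gathered as well. How Dara performs this filtering internally is not shown in this tutorial. The sketch below only illustrates which sub-systems a chemical system string such as "Ge-O-Zn" spans; `sub_systems` is a hypothetical helper, not part of Dara.

```python
from itertools import combinations


def sub_systems(chemsys: str) -> list[str]:
    """Return every non-empty sub-system of a dash-separated chemical system.

    Hypothetical helper for illustration only; not a Dara function.
    """
    elements = sorted(chemsys.split("-"))
    return [
        "-".join(combo)
        for size in range(1, len(elements) + 1)
        for combo in combinations(elements, size)
    ]


print(sub_systems("Ge-O-Zn"))
# ['Ge', 'O', 'Zn', 'Ge-O', 'Ge-Zn', 'O-Zn', 'Ge-O-Zn']
```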
Since we are using a pre-filtered database (i.e., the COD), the downloaded CIF files are automatically named according to the following convention:
{composition}_{spacegroup}_(cod|icsd_{id})-{e_hull}.cif
Here, e_hull is the energy above the convex hull in meV/atom, as determined from
the Materials Project database for the ground-state entry with matching composition and space group.
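If you want to filter or sort the downloaded references yourself, the file names can be split back into their parts. The following is a minimal sketch that assumes the convention exactly as it appears in the file names above (numeric database IDs, and the literal string None when no e_hull value is available). `parse_reference_name` and `ReferenceName` are hypothetical names introduced here, not part of Dara.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: {composition}_{spacegroup}_({cod|icsd}_{id})-{e_hull}.cif
PATTERN = re.compile(
    r"^(?P<composition>.+)_(?P<spacegroup>\d+)_"
    r"\((?P<source>cod|icsd)_(?P<entry_id>\d+)\)"
    r"-(?P<e_hull>None|-?\d+(?:\.\d+)?)\.cif$"
)


class ReferenceName(NamedTuple):
    composition: str
    spacegroup: int
    source: str
    entry_id: str
    e_hull: Optional[float]  # meV/atom, or None if not given in the name


def parse_reference_name(filename: str) -> ReferenceName:
    """Split a reference CIF file name into its components."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected file name: {filename}")
    e_hull = match["e_hull"]
    return ReferenceName(
        composition=match["composition"],
        spacegroup=int(match["spacegroup"]),
        source=match["source"],
        entry_id=match["entry_id"],
        e_hull=None if e_hull == "None" else float(e_hull),
    )


print(parse_reference_name("GeO2_60_(cod_1525833)-36.cif"))
print(parse_reference_name("Zn10.26O48_160_(cod_1536063)-None.cif"))
```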
Step 2: Search for phases#
After preparing the reference CIFs, we can start the phase search on a provided XRD pattern.
In this case, we are using the XRD pattern of the solid-state reaction sample, measured
on our laboratory’s Aeris diffractometer (tutorial_data/GeO2-ZnO_700C_60min.xrdml).
# gather all the phases in the "cifs" directory
all_cifs = list(Path("cifs").glob("*.cif"))
search_results = search_phases(
    pattern_path=pattern_path,
    phases=all_cifs,
    wavelength="Cu",
    instrument_name="Aeris-fds-Pixcel1d-Medipix3",
)
2024-08-15 19:46:55,458 INFO dara.search.tree Detecting peaks in the pattern.
2024-08-15 19:47:32,982 INFO dara.search.tree The wmax is automatically adjusted to 57.88.
2024-08-15 19:47:32,984 INFO dara.search.tree The intensity threshold is automatically set to 10.00 % of maximum peak intensity.
2024-08-15 19:47:32,985 INFO dara.search.tree Creating the root node.
2024-08-15 19:47:32,986 INFO dara.search.tree Refining all the phases in the dataset.
2024-08-15 19:47:36,032 INFO worker.py:1724 -- Started a local Ray instance.
(remote_do_refinement_no_saving pid=2320) /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/spglib/spglib.py:115: DeprecationWarning: dict interface (SpglibDataset['hall_number']) is deprecated.Use attribute interface ({self.__class__.__name__}.{key}) instead
(remote_do_refinement_no_saving pid=2320) warnings.warn(
(remote_do_refinement_no_saving pid=2320) /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/spglib/spglib.py:115: DeprecationWarning: dict interface (SpglibDataset['hall_number']) is deprecated.Use attribute interface ({self.__class__.__name__}.{key}) instead [repeated 7x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/ray-logging.html#log-deduplication for more options.)
(remote_do_refinement_no_saving pid=2320) warnings.warn( [repeated 7x across cluster]
(remote_do_refinement_no_saving pid=2320) /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/spglib/spglib.py:115: DeprecationWarning: dict interface (SpglibDataset['hall_number']) is deprecated.Use attribute interface ({self.__class__.__name__}.{key}) instead [repeated 3x across cluster]
(remote_do_refinement_no_saving pid=2320) warnings.warn( [repeated 3x across cluster]
(remote_do_refinement_no_saving pid=2320) /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/spglib/spglib.py:115: DeprecationWarning: dict interface (SpglibDataset['hall_number']) is deprecated.Use attribute interface ({self.__class__.__name__}.{key}) instead [repeated 2x across cluster]
(remote_do_refinement_no_saving pid=2320) warnings.warn( [repeated 2x across cluster]
(remote_do_refinement_no_saving pid=2321) /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/spglib/spglib.py:115: DeprecationWarning: dict interface (SpglibDataset['hall_number']) is deprecated.Use attribute interface ({self.__class__.__name__}.{key}) instead [repeated 9x across cluster]
(remote_do_refinement_no_saving pid=2321) warnings.warn( [repeated 9x across cluster]
2024-08-15 19:48:14,175 INFO dara.search.tree Finished refining 16 phases, with 7 phases removed.
(_remote_expand_node pid=2321) 2024-08-15 19:48:14,212 INFO dara.search.tree Expanding node 36a89b2c-5b3f-11ef-9dc0-ef6fdd4caf38 with current phases [], Rwp = None
(remote_do_refinement_no_saving pid=2320) /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/spglib/spglib.py:115: DeprecationWarning: dict interface (SpglibDataset['hall_number']) is deprecated.Use attribute interface ({self.__class__.__name__}.{key}) instead [repeated 2x across cluster]
(remote_do_refinement_no_saving pid=2320) warnings.warn( [repeated 2x across cluster]
(_remote_expand_node pid=2321) 2024-08-15 19:48:18,641 INFO dara.search.tree Expanding node 51b8bafb-5b3f-11ef-9dc0-ef6fdd4caf38 with current phases [RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), params={})], Rwp = 42.15
(remote_do_refinement_no_saving pid=2621) /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/spglib/spglib.py:115: DeprecationWarning: dict interface (SpglibDataset['hall_number']) is deprecated.Use attribute interface ({self.__class__.__name__}.{key}) instead [repeated 7x across cluster]
(remote_do_refinement_no_saving pid=2621) warnings.warn( [repeated 7x across cluster]
(remote_do_refinement_no_saving pid=2621) /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/spglib/spglib.py:115: DeprecationWarning: dict interface (SpglibDataset['hall_number']) is deprecated.Use attribute interface ({self.__class__.__name__}.{key}) instead [repeated 8x across cluster]
(remote_do_refinement_no_saving pid=2621) warnings.warn( [repeated 8x across cluster]
(_remote_expand_node pid=2621) 2024-08-15 19:48:27,138 INFO dara.search.tree Expanding node 51b8bafe-5b3f-11ef-9dc0-ef6fdd4caf38 with current phases [RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), params={}), RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={})], Rwp = 22.51
(_remote_expand_node pid=2321) 2024-08-15 19:48:20,095 INFO dara.search.tree Expanding node 51b8bafd-5b3f-11ef-9dc0-ef6fdd4caf38 with current phases [RefinementPhase(path=PosixPath('cifs/Zn2GeO4_148_(cod_9014631)-0.cif'), params={})], Rwp = 56.68 [repeated 2x across cluster]
(_remote_expand_node pid=2681) 2024-08-15 19:48:32,266 INFO dara.search.tree Expanding node 58afd316-5b3f-11ef-9dc0-ef6fdd4caf38 with current phases [RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), params={}), RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={}), RefinementPhase(path=PosixPath('cifs/Zn2GeO4_148_(cod_9014631)-0.cif'), params={})], Rwp = 12.04
(_remote_expand_node pid=2320) 2024-08-15 19:48:29,147 INFO dara.search.tree Expanding node 57e5639e-5b3f-11ef-9dc0-ef6fdd4caf38 with current phases [RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={}), RefinementPhase(path=PosixPath('cifs/Zn2GeO4_148_(cod_9014631)-0.cif'), params={})], Rwp = 44.26
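The "Expanding node" lines in the log show how Rwp changes as phases are added to a node (42.15 with GeO2 alone, 22.51 after adding ZnO, 12.04 after adding Zn2GeO4). If you save the log to a file, a small parser can tabulate this. The sketch below assumes the log format shown above; `parse_expand_line` is a hypothetical helper, not part of Dara, and the sample line is copied from the output above.

```python
import re
from typing import Optional

# One "Expanding node" line copied verbatim from the log output above
LOG_LINE = (
    "(_remote_expand_node pid=2321) 2024-08-15 19:48:18,641 INFO dara.search.tree "
    "Expanding node 51b8bafb-5b3f-11ef-9dc0-ef6fdd4caf38 with current phases "
    "[RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), params={})], "
    "Rwp = 42.15"
)


def parse_expand_line(line: str) -> tuple[list[str], Optional[float]]:
    """Extract the CIF paths and the Rwp value from one 'Expanding node' line."""
    phases = re.findall(r"PosixPath\('([^']+)'\)", line)
    rwp_match = re.search(r"Rwp = (None|[0-9.]+)", line)
    if rwp_match is None:
        raise ValueError("Not an 'Expanding node' log line")
    rwp_text = rwp_match.group(1)
    return phases, None if rwp_text == "None" else float(rwp_text)


phases, rwp = parse_expand_line(LOG_LINE)
print(phases, rwp)
```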
Step 3: Result analysis#
The search returns a list of SearchResult objects.
search_results
[SearchResult(refinement_result=RefinementResult(lst_data=LstResult(raw_lst='Rietveld refinement to file(s) GeO2-ZnO_700C_60min.xy\nBGMN version 4.2.23, 4416 measured points, 121 peaks, 24 parameters\nStart: Thu Aug 15 19:48:27 2024; End: Thu Aug 15 19:48:29 2024\n20 iteration steps\n\nRp=9.96% Rpb=19.06% R=10.99% Rwp=12.04% Rexp=2.69%\nDurbin-Watson d=0.10\n1-rho=1.99%\n\nGlobal parameters and GOALs\n****************************\nQGeO2152cod23003650=0.4771+-0.0021\nQZnO186cod90041780=0.3870+-0.0024\nQZn2GeO4148cod90146310=0.1359+-0.0013\nEPS2=-0.002856+-0.000013\n\nLocal parameters and GOALs for phase GeO2152cod23003650\n******************************************************\nSpacegroupNo=152\nHermannMauguin=P3_121\nXrayDensity=4.276\nRphase=10.88%\nUNIT=NM\nA=0.499111+-0.000024\nC=0.564768+-0.000033\nk1=0.0100000\nB1=0.00500000\nGEWICHT=0.2793+-0.0012\nGrainSize(1,1,1)=84.1811\nAtomic positions for phase GeO2152cod23003650\n---------------------------------------------\n 3 0.4512 0.0000 0.3333 E=(GE(1.0000))\n 6 0.3974 0.3022 0.2429 E=(O(1.0000))\n\nLocal parameters and GOALs for phase ZnO186cod90041780\n******************************************************\nSpacegroupNo=186\nHermannMauguin=P6_3mc\nXrayDensity=5.669\nRphase=9.52%\nUNIT=NM\nA=0.325072+-0.000011\nC=0.520812+-0.000030\nk1=0\nB1=0.003509+-0.000094\nGEWICHT=0.2266+-0.0021\nGrainSize(1,1,1)=120.9+-3.2\nAtomic positions for phase ZnO186cod90041780\n---------------------------------------------\n 2 0.3333 0.6667 0.0000 E=(ZN(1.0000))\n 2 0.3333 0.6667 0.3821 E=(O(1.0000))\n\nLocal parameters and GOALs for phase Zn2GeO4148cod90146310\n******************************************************\nSpacegroupNo=148\nHermannMauguin=R-3\nXrayDensity=4.777\nRphase=20.28%\nUNIT=NM\nA=1.423755+-0.000083\nC=0.952849+-0.000079\nk1=0.0100000\nB1=0.00500000\nGEWICHT=0.07959+-0.00075\nGrainSize(1,1,1)=84.1811\nAtomic positions for phase Zn2GeO4148cod90146310\n---------------------------------------------\n 18 0.2150 0.1940 
0.5830 E=(ZN(1.0000))\n 18 0.5483 0.8607 0.5837 E=(ZN(1.0000))\n 18 0.2150 0.1940 0.2500 E=(GE(1.0000))\n 18 0.8877 0.4633 0.4293 E=(O(1.0000))\n 18 0.2220 0.1310 0.4030 E=(O(1.0000))\n 18 0.2230 0.1140 0.7500 E=(O(1.0000))\n 18 0.9957 0.6613 0.5833 E=(O(1.0000))\n', pattern_name='GeO2-ZnO_700C_60min.xy', num_steps=20, rp=9.96, rpb=19.06, r=10.99, rwp=12.04, rexp=2.69, d=0.1, rho=1.99, phases_results={'GeO2_152_(cod_2300365)-0': PhaseResult(spacegroup_no=152, hermann_mauguin='P3_121', xray_density=4.276, rphase=10.88, unit='NM', gewicht=(0.2793, 0.0012), gewicht_name=None, a=(0.499111, 2.4e-05), b=None, c=(0.564768, 3.3e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 3 0.4512 0.0000 0.3333 E=(GE(1.0000))\n 6 0.3974 0.3022 0.2429 E=(O(1.0000))', k1=0.01, B1=0.005), 'ZnO_186_(cod_9004178)-0': PhaseResult(spacegroup_no=186, hermann_mauguin='P6_3mc', xray_density=5.669, rphase=9.52, unit='NM', gewicht=(0.2266, 0.0021), gewicht_name=None, a=(0.325072, 1.1e-05), b=None, c=(0.520812, 3e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 2 0.3333 0.6667 0.0000 E=(ZN(1.0000))\n 2 0.3333 0.6667 0.3821 E=(O(1.0000))', k1=0, B1=(0.003509, 9.4e-05)), 'Zn2GeO4_148_(cod_9014631)-0': PhaseResult(spacegroup_no=148, hermann_mauguin='R-3', xray_density=4.777, rphase=20.28, unit='NM', gewicht=(0.07959, 0.00075), gewicht_name=None, a=(1.423755, 8.3e-05), b=None, c=(0.952849, 7.9e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 18 0.2150 0.1940 0.5830 E=(ZN(1.0000))\n 18 0.5483 0.8607 0.5837 E=(ZN(1.0000))\n 18 0.2150 0.1940 0.2500 E=(GE(1.0000))\n 18 0.8877 0.4633 0.4293 E=(O(1.0000))\n 18 0.2220 0.1310 0.4030 E=(O(1.0000))\n 18 0.2230 0.1140 0.7500 E=(O(1.0000))\n 18 0.9957 0.6613 0.5833 E=(O(1.0000))', k1=0.01, B1=0.005)})), phases=((RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), params={}), RefinementPhase(path=PosixPath('cifs/GeO2_154_(cod_9007477)-0.cif'), params={})), 
(RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={}),), (RefinementPhase(path=PosixPath('cifs/Zn2GeO4_148_(cod_9014631)-0.cif'), params={}),)), foms=((0.0,), (0.035359886962825535, 0.03528024374187284), (0.13639804324187943,), (0.33427282595205166,)), lattice_strains=((0.0,), (0.00018588822358559277, 0.00048125125325843384), (0.0005712782487738224,), (-0.0033382277696805732,)), missing_peaks=[], extra_peaks=[])]
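The raw output above is dense, but it contains the refined quantities for each phase. As an example, the GEWICHT values of the three phases can be normalized so that they sum to one; the result agrees, to within rounding, with the Q... values reported in raw_lst (0.4771, 0.3870, 0.1359). Reading these normalized values as weight fractions is an assumption based on the usual BGMN convention. The numbers below are copied from the PhaseResult entries above; with a live result you would read them from `search_results[0].refinement_result.lst_data.phases_results`, as shown in the printed object.

```python
# GEWICHT as (value, uncertainty), copied from the PhaseResult entries above
gewicht = {
    "GeO2_152_(cod_2300365)-0": (0.2793, 0.0012),
    "ZnO_186_(cod_9004178)-0": (0.2266, 0.0021),
    "Zn2GeO4_148_(cod_9014631)-0": (0.07959, 0.00075),
}

# Normalize so that the values of all phases sum to 1
total = sum(value for value, _ in gewicht.values())
fractions = {name: value / total for name, (value, _) in gewicht.items()}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3f}")
```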
For this pattern, the search found only one solution, with Rwp = 12.04 %.
for i, result in enumerate(search_results):
    print(f"Rwp of solution {i} = {result.refinement_result.lst_data.rwp} %")
Rwp of solution 0 = 12.04 %
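For reference, Rwp is the weighted profile R-factor, which compares the observed and calculated intensities point by point; lower values mean a better fit. The sketch below uses the textbook definition with the common choice of weights 1/y_obs. It is not Dara's or BGMN's implementation, whose details (for example, background handling) may differ, and `weighted_profile_r` is a hypothetical name.

```python
import math


def weighted_profile_r(y_obs, y_calc, weights=None):
    """Return Rwp in percent: 100 * sqrt(sum(w*(obs-calc)^2) / sum(w*obs^2))."""
    if weights is None:
        # Common choice: w_i = 1 / y_obs_i (counting statistics)
        weights = [1.0 / y if y > 0 else 0.0 for y in y_obs]
    numerator = sum(w * (o - c) ** 2 for w, o, c in zip(weights, y_obs, y_calc))
    denominator = sum(w * o**2 for w, o in zip(weights, y_obs))
    return 100.0 * math.sqrt(numerator / denominator)


# Toy data (not from the tutorial pattern)
y_obs = [100.0, 400.0, 100.0]
y_calc = [90.0, 420.0, 100.0]
rwp = weighted_profile_r(y_obs, y_calc)
print(f"Rwp = {rwp:.2f} %")
```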
Each SearchResult has a .visualize() method to visualize the refined pattern and
missing/extra peaks in the solution. If there are no missing or extra peaks, this option
will not appear.
search_results[0].visualize()
You can also view all the alternative phases in a solution through the SearchResult.phases attribute.
print("Phases found in solution 0:")
for i, phases_ in enumerate(search_results[0].phases):
    print(f" - Phase {i}: {[phase.path.name for phase in phases_]}")
Phases found in solution 0:
- Phase 0: ['GeO2_152_(cod_2300365)-0.cif', 'GeO2_154_(cod_9007477)-0.cif']
- Phase 1: ['ZnO_186_(cod_9004178)-0.cif']
- Phase 2: ['Zn2GeO4_148_(cod_9014631)-0.cif']
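From the result, you can see that for the phase GeO2, the algorithm identifies two
similar candidate structures with slightly different space groups (152 and 154).
Space groups 152 (P3_121) and 154 (P3_221) form an enantiomorphic pair, which powder XRD
cannot distinguish, so both entries are listed as alternatives for the same phase.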